This HTML document contains the output of the CATS-rf transcriptome assembly comparison tool. For more details on each table and figure, refer to the tool’s documentation.

General transcriptome assembly statistics

Table 1. General transcriptome assembly statistics.

Parameter RSP_0.005_1_4 RSP_0.01_1_4 RSP_0.02_1_4 RSP_0.005_5_10 RSP_0.01_5_10 RSP_0.02_5_10 RSP_0.005_11_20
N transcripts 27748 30651 36610 18601 18652 19217 18641
Total assembly length (bp) 34167218 32433743 27462499 44994186 44145594 42094487 45586004
N, % transcripts longer than 200 bp 23903, 86.14% 25617, 83.58% 28233, 77.12% 18277, 98.26% 18225, 97.71% 18132, 94.35% 18371, 98.55%
N, % transcripts longer than 500 bp 14553, 52.45% 13923, 45.42% 11423, 31.2% 16878, 90.74% 16738, 89.74% 16335, 85% 17019, 91.3%
N, % transcripts longer than 1000 bp 9887, 35.63% 8846, 28.86% 6486, 17.72% 13488, 72.51% 13356, 71.61% 12875, 67% 13620, 73.06%
N, % transcripts longer than 5000 bp 1036, 3.73% 914, 2.98% 608, 1.66% 2003, 10.77% 1931, 10.35% 1766, 9.19% 2024, 10.86%
N, % transcripts longer than 10000 bp 93, 0.34% 90, 0.29% 59, 0.16% 228, 1.23% 221, 1.18% 202, 1.05% 234, 1.26%
N, % transcripts longer than 20000 bp 4, 0.01% 6, 0.02% 6, 0.02% 10, 0.05% 10, 0.05% 6, 0.03% 10, 0.05%
Mean transcript length (bp) 1231.34 1058.16 750.14 2418.91 2366.8 2190.48 2445.47
Median transcript length (bp) 547.5 434 310 1756 1714 1575 1778
Transcript length IQR (bp) 258-1570 234-1238 206-641 926-3168 899-3096 762-2878 944-3211
Transcript length range (bp) 131-60159 131-58349 131-45657 131-50033 131-50033 131-49970 131-50248
N50 (bp) 2597 2388 1831 3590 3527 3401 3602
L50 3745 3737 3789 3836 3816 3748 3876
N90 (bp) 477 372 249 1215 1194 1135 1224
L90 14985 16939 22581 12212 12182 12096 12283
GC content (%) 49.17% 49.34% 49.68% 48.43% 48.45% 48.60% 48.37%
% of reads mapping to the assembly 97.71% 96.08% 89.99% 99.65% 99.53% 98.51% 99.67%
Parameter RSP_0.01_11_20 RSP_0.02_11_20
N transcripts 18719 19432
Total assembly length (bp) 45065633 43400943
N, % transcripts longer than 200 bp 18312, 97.83% 18210, 93.71%
N, % transcripts longer than 500 bp 16886, 90.21% 16573, 85.29%
N, % transcripts longer than 1000 bp 13517, 72.21% 13131, 67.57%
N, % transcripts longer than 5000 bp 1987, 10.61% 1872, 9.63%
N, % transcripts longer than 10000 bp 230, 1.23% 225, 1.16%
N, % transcripts longer than 20000 bp 12, 0.06% 11, 0.06%
Mean transcript length (bp) 2407.48 2233.48
Median transcript length (bp) 1749 1605
Transcript length IQR (bp) 914-3153.5 776.75-2950
Transcript length range (bp) 132-50317 132-50173
N50 (bp) 3576 3485
L50 3847 3794
N90 (bp) 1214 1160
L90 12234 12208
GC content (%) 48.39% 48.52%
% of reads mapping to the assembly 99.56% 98.64%

IQR = interquartile range
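
For readers who want to sanity-check the contiguity rows of Table 1, the sketch below computes Nx and Lx from a list of transcript lengths using the conventional definition (Nx is the length of the shortest transcript in the smallest set of longest transcripts that together span at least x% of the assembly; Lx is the size of that set). This definition is assumed to match what CATS-rf reports; the function name `nx_lx` and the toy lengths are hypothetical. The last lines recompute two derived values for RSP_0.005_1_4 from the counts given in the table.

```python
from typing import Sequence, Tuple


def nx_lx(lengths: Sequence[int], x: float = 50.0) -> Tuple[int, int]:
    """Return (Nx, Lx) under the conventional definition (assumed, not taken from CATS-rf)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)      # longest transcripts first
    threshold = sum(ordered) * x / 100.0         # x% of the total assembly length
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count                 # Nx in bp, Lx as a transcript count
    return ordered[-1], len(ordered)             # only reached if x > 100


# Hypothetical toy assembly, total length 1500 bp
toy_lengths = [100, 200, 300, 400, 500]
n50, l50 = nx_lx(toy_lengths, 50)
n90, l90 = nx_lx(toy_lengths, 90)

# Consistency check against Table 1, column RSP_0.005_1_4
n_transcripts = 27748
total_length_bp = 34167218
n_longer_200 = 23903
mean_length = round(total_length_bp / n_transcripts, 2)
pct_longer_200 = round(100 * n_longer_200 / n_transcripts, 2)
```

The recomputed mean length and percentage agree with the reported 1231.34 bp and 86.14%, which suggests that the "Mean transcript length" and "N, %" rows are derived from the transcript count and total assembly length in the same column.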

Assembly scores

Table 2. Transcript score component statistics and assembly scores.

Parameter RSP_0.005_1_4 RSP_0.01_1_4 RSP_0.02_1_4 RSP_0.005_5_10 RSP_0.01_5_10 RSP_0.02_5_10 RSP_0.005_11_20
Coverage score component (mean, IQR) 0.571, 0.413-0.747 0.577, 0.425-0.75 0.586, 0.442-0.748 0.881, 0.839-1 0.884, 0.847-1 0.867, 0.852-1 0.896, 0.855-1
Accuracy score component (mean, IQR) 0.923, 0.906-0.944 0.882, 0.862-0.903 0.814, 0.785-0.837 0.941, 0.924-0.967 0.902, 0.889-0.923 0.781, 0.771-0.793 0.944, 0.924-0.972
Local fidelity score component (mean, IQR) 0.821, 0.698-1 0.758, 0.622-0.948 0.647, 0.5-0.816 0.969, 0.952-1 0.958, 0.938-1 0.923, 0.901-1 0.97, 0.956-1
Integrity score component (mean, IQR) 0.714, 0.493-1 0.685, 0.45-1 0.646, 0.408-1 0.92, 0.891-1 0.917, 0.89-1 0.898, 0.885-1 0.92, 0.888-1
Assembly score 0.349 0.3 0.219 0.757 0.72 0.582 0.771
Parameter RSP_0.01_11_20 RSP_0.02_11_20
Coverage score component (mean, IQR) 0.895, 0.862-1 0.871, 0.865-1
Accuracy score component (mean, IQR) 0.908, 0.893-0.93 0.778, 0.768-0.79
Local fidelity score component (mean, IQR) 0.962, 0.944-1 0.934, 0.914-1
Integrity score component (mean, IQR) 0.917, 0.889-1 0.903, 0.883-1
Assembly score 0.735 0.589

IQR = interquartile range

Figure 1. Transcript score distribution.

Coverage and accuracy statistics

Table 3. Coverage and accuracy statistics.

Parameter RSP_0.005_1_4 RSP_0.01_1_4 RSP_0.02_1_4 RSP_0.005_5_10 RSP_0.01_5_10 RSP_0.02_5_10 RSP_0.005_11_20
% of covered bases per transcript (mean, IQR) 98.65%, 100%-100% 98.67%, 100%-100% 98.77%, 100%-100% 98.79%, 100%-100% 98.81%, 100%-100% 97.09%, 100%-100% 98.81%, 100%-100%
N, % fully covered transcripts 23041, 83.04% 25437, 82.99% 30107, 82.24% 15050, 80.91% 15004, 80.44% 15305, 79.64% 15118, 81.1%
N, % fully uncovered transcripts 35, 0.13% 42, 0.14% 41, 0.11% 62, 0.33% 73, 0.39% 408, 2.12% 58, 0.31%
Mean coverage per transcript (mean, IQR) 6.34, 3.09-7.45 6.33, 3.19-7.19 6.42, 3.32-7.37 33.23, 17.91-40.3 33.5, 17.92-40.56 32.62, 17.33-40.25 42.79, 23.53-51.91
N, % bases with coverage equal to or higher than 5 23855056, 69.82% 23149363, 71.37% 20455832, 74.49% 41445454, 92.11% 40953073, 92.77% 39444554, 93.7% 42580551, 93.41%
N, % bases with coverage equal to or higher than 10 13364652, 39.12% 13280936, 40.95% 12657237, 46.09% 37737673, 83.87% 37415358, 84.75% 36437305, 86.56% 39786932, 87.28%
N, % bases with coverage equal to or higher than 20 4728428, 13.84% 4799585, 14.8% 4865699, 17.72% 29081511, 64.63% 28942101, 65.56% 28498282, 67.7% 33586234, 73.68%
N, % bases with coverage equal to or higher than 40 840307, 2.46% 878120, 2.71% 972668, 3.54% 15807418, 35.13% 15794777, 35.78% 15717903, 37.34% 19951692, 43.77%
N, % bases with coverage equal to or higher than 60 210452, 0.62% 209214, 0.65% 230581, 0.84% 9226481, 20.51% 9265959, 20.99% 9260002, 22% 13592696, 29.82%
N, % bases with coverage equal to or higher than 80 80643, 0.24% 79714, 0.25% 77844, 0.28% 5748049, 12.78% 5804791, 13.15% 5831676, 13.85% 8787932, 19.28%
N, % bases with coverage equal to or higher than 100 35875, 0.1% 35453, 0.11% 33673, 0.12% 3692371, 8.21% 3741056, 8.47% 3790823, 9.01% 6153487, 13.5%
Maximum uncovered region length per transcript (bp) (mean, IQR) 16.68, 0-0 14.38, 0-0 7.57, 0-0 25.44, 0-0 21.17, 0-0 19.57, 0-0 25.19, 0-0
Mean end coverage per transcript (mean, IQR) 1.88, 1-2.14 1.89, 1-2.1 1.86, 1-2.09 7.47, 2.61-10.41 7.55, 2.82-10.3 7.65, 3.04-10.38 9.53, 3.03-13.58
N, % assembly bases in LCR 6933935, 20.29% 6295509, 19.41% 4821704, 17.56% 2663500, 5.92% 2362921, 5.35% 1950060, 4.63% 2286107, 5.01%
% of bases in LCR per transcript (mean, IQR) 37.08%, 10.95%-55.91% 36.06%, 10.68%-53.55% 34.47%, 10.81%-50% 5.35%, 0%-4.15% 5.32%, 0%-3.8% 7.5%, 0%-3.52% 4.37%, 0%-3.3%
LCR length (bp) (mean, IQR) 68.69, 17-75 65.67, 17-70 55.02, 18-63 79.81, 15-72 73.23, 14-66 63.57, 11-57 77.21, 15-67
Coverage score component (mean, IQR) 0.571, 0.413-0.747 0.577, 0.425-0.75 0.586, 0.442-0.748 0.881, 0.839-1 0.884, 0.847-1 0.867, 0.852-1 0.896, 0.855-1
% of accurate bases (bases with accuracy higher than or equal to 0.95) per transcript (mean, IQR) 97%, 95.85%-98.62% 94.57%, 92.44%-97.1% 89.73%, 85.78%-94.09% 96.61%, 95.63%-98.09% 93.64%, 91.84%-95.9% 86.5%, 83.65%-89.57% 97.22%, 96.48%-98.55%
N, % bases with accuracy equal to or higher than 0.2 33319098, 99.92% 31605559, 99.86% 26927719, 99.71% 44268844, 99.97% 43523598, 99.95% 41533805, 99.91% 44893256, 99.97%
N, % bases with accuracy equal to or higher than 0.4 33307089, 99.89% 31581423, 99.78% 26880146, 99.54% 44262642, 99.95% 43515373, 99.93% 41519284, 99.87% 44887092, 99.95%
N, % bases with accuracy equal to or higher than 0.6 33259480, 99.74% 31507367, 99.55% 26770626, 99.13% 44234343, 99.89% 43479437, 99.85% 41469832, 99.75% 44859850, 99.89%
N, % bases with accuracy equal to or higher than 0.8 33109179, 99.29% 31285011, 98.84% 26459499, 97.98% 44129441, 99.65% 43346851, 99.54% 41285147, 99.31% 44759515, 99.67%
N, % bases with accuracy equal to or higher than 0.85 32934436, 98.77% 30992922, 97.92% 26027306, 96.38% 44033732, 99.44% 43202252, 99.21% 41039715, 98.72% 44673850, 99.48%
N, % bases with accuracy equal to or higher than 0.9 32666680, 97.97% 30515969, 96.41% 25248112, 93.5% 43854301, 99.03% 42890457, 98.5% 40394706, 97.17% 44520241, 99.13%
N, % bases with accuracy equal to or higher than 0.95 32028576, 96.05% 29338841, 92.69% 23196930, 85.9% 43000723, 97.1% 41168525, 94.54% 36560245, 87.94% 43815550, 97.57%
N, % bases with accuracy equal to or higher than 0.99 31365980, 94.07% 28189298, 89.06% 21427622, 79.35% 37599921, 84.91% 31576378, 72.51% 22262258, 53.55% 37393294, 83.26%
N, % bases with accuracy equal to or higher than 1 31352558, 94.02% 28176493, 89.02% 21421381, 79.32% 36216294, 81.78% 30221838, 69.4% 21581997, 51.91% 34987671, 77.91%
N, % assembly bases in LAR 1003712, 3.01% 2442506, 7.72% 7173993, 26.57% 989999, 2.24% 2489562, 5.72% 15745507, 37.88% 963507, 2.15%
% of bases in LAR per transcript (mean, IQR) 2.89%, 1.06%-3.55% 6.9%, 3.88%-8.92% 19.47%, 11.93%-26.45% 2.54%, 0.59%-2.91% 6.53%, 3.96%-7.59% 36.1%, 31.71%-41.26% 2.44%, 0.42%-2.81%
LAR length (bp) (mean, IQR) 4.01, 1-4 4.5, 1-5 7.42, 1-10 6.9, 1-6 4.93, 1-6 7.63, 1-10 7.71, 1-7
Accuracy score component (mean, IQR) 0.923, 0.906-0.944 0.882, 0.862-0.903 0.814, 0.785-0.837 0.941, 0.924-0.967 0.902, 0.889-0.923 0.781, 0.771-0.793 0.944, 0.924-0.972
Parameter RSP_0.01_11_20 RSP_0.02_11_20
% of covered bases per transcript (mean, IQR) 98.55%, 100%-100% 95.95%, 100%-100%
N, % fully covered transcripts 15216, 81.29% 15326, 78.87%
N, % fully uncovered transcripts 121, 0.65% 625, 3.22%
Mean coverage per transcript (mean, IQR) 42.83, 23.49-52.04 41.58, 22.53-51.52
N, % bases with coverage equal to or higher than 5 42240898, 93.73% 41001937, 94.47%
N, % bases with coverage equal to or higher than 10 39507341, 87.67% 38610311, 88.96%
N, % bases with coverage equal to or higher than 20 33457021, 74.24% 32940608, 75.9%
N, % bases with coverage equal to or higher than 40 19944930, 44.26% 19639668, 45.25%
N, % bases with coverage equal to or higher than 60 13553890, 30.08% 13406648, 30.89%
N, % bases with coverage equal to or higher than 80 8758759, 19.44% 8664520, 19.96%
N, % bases with coverage equal to or higher than 100 6142684, 13.63% 6116595, 14.09%
Maximum uncovered region length per transcript (bp) (mean, IQR) 23.09, 0-0 21.4, 0-0
Mean end coverage per transcript (mean, IQR) 9.66, 3.27-13.47 9.4, 3.43-13.11
N, % assembly bases in LCR 2137509, 4.74% 1805120, 4.16%
% of bases in LCR per transcript (mean, IQR) 4.69%, 0%-3.04% 7.71%, 0%-2.93%
LCR length (bp) (mean, IQR) 73.81, 13-64 64.68, 11-57
Coverage score component (mean, IQR) 0.895, 0.862-1 0.871, 0.865-1
% of accurate bases (bases with accuracy higher than or equal to 0.95) per transcript (mean, IQR) 94.7%, 93.38%-96.72% 87.94%, 85.47%-91.02%
N, % bases with accuracy equal to or higher than 0.2 44411567, 99.95% 42808085, 99.92%
N, % bases with accuracy equal to or higher than 0.4 44403231, 99.93% 42795033, 99.89%
N, % bases with accuracy equal to or higher than 0.6 44367949, 99.85% 42748872, 99.78%
N, % bases with accuracy equal to or higher than 0.8 44242300, 99.57% 42579816, 99.39%
N, % bases with accuracy equal to or higher than 0.85 44116752, 99.29% 42373504, 98.91%
N, % bases with accuracy equal to or higher than 0.9 43859221, 98.71% 41836868, 97.65%
N, % bases with accuracy equal to or higher than 0.95 42379318, 95.38% 38233483, 89.24%
N, % bases with accuracy equal to or higher than 0.99 30711291, 69.12% 20727304, 48.38%
N, % bases with accuracy equal to or higher than 1 28445921, 64.02% 19669286, 45.91%
N, % assembly bases in LAR 2306315, 5.19% 16917234, 39.49%
% of bases in LAR per transcript (mean, IQR) 5.92%, 3.27%-6.92% 38.22%, 34.13%-43.06%
LAR length (bp) (mean, IQR) 5.07, 1-6 7.56, 1-10
Accuracy score component (mean, IQR) 0.908, 0.893-0.93 0.778, 0.768-0.79

IQR = interquartile range, LCR = low-coverage region, LAR = low-accuracy region
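
The "N, % bases with coverage equal to or higher than ..." rows of Table 3 can be read as threshold counts over per-base coverage. The sketch below shows that calculation on a hypothetical coverage vector; the function name `bases_at_or_above` and the toy values are illustrative, and it is an assumption that the percentage denominator is the total number of assembly bases. That assumption is consistent with the coverage rows (the last line reproduces the 69.82% reported for RSP_0.005_1_4 at coverage 5), but it has not been checked for the accuracy rows, which appear to use a different denominator.

```python
from typing import Dict, Iterable, Sequence, Tuple


def bases_at_or_above(
    coverage: Iterable[int], thresholds: Sequence[int]
) -> Dict[int, Tuple[int, float]]:
    """Map each threshold to (number of bases, percent of all bases) with coverage >= threshold."""
    cov = list(coverage)
    total = len(cov)
    if total == 0:
        raise ValueError("coverage must not be empty")
    summary = {}
    for t in thresholds:
        n = sum(1 for c in cov if c >= t)        # bases meeting the threshold
        summary[t] = (n, round(100 * n / total, 2))
    return summary


# Hypothetical per-base coverage for a 10 bp toy assembly
toy_coverage = [0, 3, 5, 7, 10, 12, 20, 25, 40, 100]
toy_summary = bases_at_or_above(toy_coverage, [5, 10, 20, 40])

# Consistency check against Tables 1 and 3, column RSP_0.005_1_4
bases_cov_ge_5 = 23855056
total_length_bp = 34167218
pct_cov_ge_5 = round(100 * bases_cov_ge_5 / total_length_bp, 2)
```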

Figure 2. Per-base coverage category distribution.

Figure 3. Proportion of covered bases per transcript category distribution.

Figure 4. Mean transcript coverage category distribution.

Figure 5. Positional relative coverage distribution.

Figure 6. Maximum uncovered region length per transcript distribution.

Figure 7. Mean transcript end coverage per transcript distribution.

Figure 8. Proportion of bases in low-coverage regions per transcript category distribution.

Figure 9. Low-coverage region length distribution.

Figure 10. Coverage score component distribution.

Figure 11. Per-base accuracy category distribution.

Figure 12. Proportion of accurate bases per transcript category distribution.

Figure 13. Positional accuracy distribution.

Figure 14. Proportion of bases in low-accuracy regions per transcript category distribution.

Figure 15. Low-accuracy region length distribution.

Figure 16. Accuracy score component distribution.

Paired-end read analysis

Table 4. Local fidelity and integrity statistics.

Parameter RSP_0.005_1_4 RSP_0.01_1_4 RSP_0.02_1_4 RSP_0.005_5_10 RSP_0.01_5_10 RSP_0.02_5_10 RSP_0.005_11_20
N, % reads with pair not mapped to the assembly 44675, 1.23% 66084, 1.84% 117366, 3.5% 41293, 0.22% 51624, 0.28% 96419, 0.53% 51416, 0.21%
N, % reads with pair not mapped to the assembly on transcript ends 16728, 10.02% 26398, 13.32% 52156, 19.51% 8223, 2.27% 9713, 2.62% 16557, 4.28% 11085, 2.43%
% of reads with pair not mapped to the assembly on transcript ends per transcript (mean, IQR) 11.55%, 0%-20% 15.33%, 0%-25% 22.31%, 0%-33.33% 2.5%, 0%-0% 3.5%, 0%-0% 6.08%, 0%-7.69% 2.31%, 0%-0%
N, % reads with pair mapped in an unexpected orientation 4, 0% 14, 0% 0, 0% 638, 0% 2, 0% 542, 0% 62, 0%
N, % reads with pair mapped too far apart 6034, 0.17% 9344, 0.26% 30920, 0.92% 20788, 0.11% 21018, 0.11% 22172, 0.12% 27526, 0.11%
N, % improperly paired reads within a transcript 50713, 1.39% 75442, 2.11% 148286, 4.42% 62719, 0.34% 72644, 0.39% 119133, 0.65% 79004, 0.33%
% of improperly paired reads within a transcript per transcript (mean, IQR) 7.22%, 0%-9.09% 10.55%, 0.28%-14.29% 17.63%, 3.45%-25% 0.43%, 0%-0.23% 0.63%, 0%-0.38% 1.46%, 0%-0.97% 0.41%, 0%-0.19%
Local fidelity score component (mean, IQR) 0.821, 0.698-1 0.758, 0.622-0.948 0.647, 0.5-0.816 0.969, 0.952-1 0.958, 0.938-1 0.923, 0.901-1 0.97, 0.956-1
N, % reads with pair mapped to another transcript 92682, 2.54% 105814, 2.95% 137456, 4.1% 288950, 1.56% 287580, 1.56% 275434, 1.51% 379204, 1.58%
% of reads with pair mapped to another transcript per transcript (mean, IQR) 14.39%, 0%-22.22% 15.72%, 0%-25% 17.42%, 0%-26.67% 3.23%, 0%-2.68% 3.41%, 0%-2.73% 4.86%, 0%-2.92% 3.25%, 0%-2.82%
N, % fragmented transcripts 571, 2.06% 818, 2.67% 1499, 4.09% 727, 3.91% 704, 3.77% 707, 3.68% 808, 4.33%
N, % reads representing bridging events on transcript ends 9074, 5.43% 12520, 6.32% 20270, 7.58% 6614, 1.83% 6212, 1.68% 6478, 1.67% 8486, 1.86%
% of reads representing bridging events on transcript ends per transcript (mean, IQR) 6.36%, 0%-0% 7.19%, 0%-12.5% 8.24%, 0%-14.29% 1.87%, 0%-0% 1.75%, 0%-0% 1.82%, 0%-0% 1.8%, 0%-0%
Integrity score component (mean, IQR) 0.714, 0.493-1 0.685, 0.45-1 0.646, 0.408-1 0.92, 0.891-1 0.917, 0.89-1 0.898, 0.885-1 0.92, 0.888-1
Parameter RSP_0.01_11_20 RSP_0.02_11_20
N, % reads with pair not mapped to the assembly 60936, 0.25% 110944, 0.47%
N, % reads with pair not mapped to the assembly on transcript ends 10830, 2.32% 17004, 3.53%
% of reads with pair not mapped to the assembly on transcript ends per transcript (mean, IQR) 2.96%, 0%-0% 4.95%, 0%-5.26%
N, % reads with pair mapped in an unexpected orientation 44, 0% 70, 0%
N, % reads with pair mapped too far apart 27088, 0.11% 27766, 0.12%
N, % improperly paired reads within a transcript 88068, 0.37% 138780, 0.58%
% of improperly paired reads within a transcript per transcript (mean, IQR) 0.51%, 0%-0.32% 1.07%, 0%-0.74%
Local fidelity score component (mean, IQR) 0.962, 0.944-1 0.934, 0.914-1
N, % reads with pair mapped to another transcript 378110, 1.57% 367392, 1.54%
% of reads with pair mapped to another transcript per transcript (mean, IQR) 3.48%, 0%-2.76% 4.61%, 0%-3%
N, % fragmented transcripts 802, 4.28% 806, 4.15%
N, % reads representing bridging events on transcript ends 8146, 1.75% 8818, 1.83%
% of reads representing bridging events on transcript ends per transcript (mean, IQR) 1.79%, 0%-0% 1.83%, 0%-0%
Integrity score component (mean, IQR) 0.917, 0.889-1 0.903, 0.883-1

IQR = interquartile range

Figure 17. Per-transcript proportion of improperly paired reads within a transcript category distribution.

Figure 18. Local fidelity score component distribution.

Figure 19. Per-transcript proportion of reads with pair mapped to another transcript category distribution.

Figure 20. Integrity score component distribution.